{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "498d70fa",
   "metadata": {},
   "source": [
    "**<font color = black size=6>实验三:逻辑回归</font>**"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fa5c5483",
   "metadata": {},
   "source": [
    "**<font color = blue size=4>第一部分:函数介绍</font>**"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "869f21ce",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">机器学习所使用的数据集并不一定能够直接进行使用，如可能出现缺失值或者异常值（如异常0或者异常大的数据），又或者是数据类型不适合直接应用于模型训练等。因此，数据预处理是机器学习过程中十分重要的一个部分。<br>本次实验包括数据预处理，包括判断数据是否有缺失值以及异常值、缺失值替换操作、并将部分特征值类型转换为适合模型训练的类型。在进行逻辑回归实验之前，首先介绍下上述操作所用到的python语法。</span>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3957a94a",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">1) 判断一列中是否有缺失值</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5472add3",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import math\n",
    "import matplotlib as mpl\n",
    "# ignore warnings\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dc6837f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "#通过pandas将csv文件转换为dataframe格式后进行操作\n",
    "train1_frame = pd.read_csv('introduction.csv')\n",
    "#该数据描述了不同性别的人的身高和体重数据\n",
    "print(train1_frame)\n",
    "#打印出数据，观察数据异常值情况"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "029c5ac0",
   "metadata": {},
   "outputs": [],
   "source": [
    "#可发现数据异常值包括0和空值(即NAN)\n",
    "#判断每一列是否有空值并返回每列空值的数量\n",
    "print(train1_frame.isnull().sum())\n",
    "print(train1_frame)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "75d469ff",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">2) 对每一列的缺失值和异常值进行替换操作</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "891fed76",
   "metadata": {},
   "outputs": [],
   "source": [
    "#先将0替换为空值，之后可考虑将空值替换为平均值。\n",
    "#利用pandas中的replace函数将某一列的指定值替换为另一个值\n",
    "train1_frame[['height', 'weight']]=train1_frame[['height', 'weight']].replace(0, np.NaN)\n",
    "print(train1_frame)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d42546a4",
   "metadata": {},
   "outputs": [],
   "source": [
    "#替换空值为平均值\n",
    "# 先分离 height 列和weight列\n",
    "height_column = train1_frame['height']\n",
    "weight_column = train1_frame['weight']\n",
    "# 按性别分类并计算每组的平均身高和平均体重\n",
    "mean_height_by_gender = train1_frame.groupby('sex')['height'].transform('mean')\n",
    "mean_weight_by_gender = train1_frame.groupby('sex')['weight'].transform('mean')\n",
    "# 使用每个性别组的平均值来替换缺失值\n",
    "train1_frame['height'].fillna(mean_height_by_gender, inplace=True)\n",
    "train1_frame['weight'].fillna(mean_weight_by_gender, inplace=True)\n",
    "# 打印结果\n",
    "print(train1_frame)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d0ea36f6",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">3) 将标记值从字符串变成容易操作的整数类型</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8341000e",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(train1_frame)\n",
    "train1_frame['sex'] = np.where(train1_frame['sex'] == \"Male\", 0, 1)\n",
    "print(train1_frame)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac00368d",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">4) 可视化分类决策边界</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f4314e6f",
   "metadata": {},
   "outputs": [],
   "source": [
    "#确定图画边界和大小\n",
    "plt.figure(figsize=(10,5))\n",
    "x_min, x_max = 0,10\n",
    "y_min, y_max = 0,5\n",
    "#使用numpy中的meshgrid生成网格矩阵，方便进行之后的描点\n",
    "boundary_x, boundary_y = np.meshgrid(np.arange(x_min, x_max, 0.01),np.arange(y_min, y_max, 0.01))\n",
    "grid = np.c_[boundary_x.ravel(), boundary_y.ravel()]\n",
    "#加入偏置(或w_0)对应的特征为1的一列\n",
    "e=np.ones((len(grid),1))\n",
    "grid=np.c_[e,grid]\n",
    "#假定下列的模型参数\n",
    "w=np.array([[0],[3],[-6]])\n",
    "#计算出网格点中每个点对应的逻辑回归预测值\n",
    "z=grid.dot(w)\n",
    "for i in range(len(z)):\n",
    "    z[i][0]=(1/(1+np.exp(-z[i][0])))\n",
    "    if(z[i][0]<0.5):z[i][0]=0\n",
    "    else:z[i][0]=1\n",
    "#转换shape以作出决策边界\n",
    "z=z.reshape(boundary_x.shape)\n",
    "plt.contourf(boundary_x, boundary_y, z, cmap=plt.cm.Spectral, zorder=1)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8ab1eff8",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">5) 散点图</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8c07701a",
   "metadata": {},
   "outputs": [],
   "source": [
    "\"\"\"\n",
    "可使用plt.scatter来绘制出测试集的每个样本点，并设置指定颜色来区分预测正确和错误的样本\n",
    "plt.scatter(x,y,c=\"color\")，x、y为坐标值，c为指定颜色\n",
    "\"\"\"\n",
    "class_1=train1_frame[train1_frame['sex']==1]\n",
    "class_0=train1_frame[train1_frame['sex']==0]\n",
    "plt.scatter(class_1['height'],class_1['weight'],c='blue')\n",
    "plt.scatter(class_0['height'],class_0['weight'],c='red')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "657beb0c",
   "metadata": {},
   "source": [
    "**<font color = blue size=4>第二部分:逻辑回归</font>**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "226badd7",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import math\n",
    "import matplotlib as mpl"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7a0bdb6f",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">Iris数据集（鸢尾花卉数据集）是常用的分类实验数据集,是一类多重变量分析的数据集。我们实验选取数据集的部分内容，包含训练集80个数据样本和测试20个样本。每个数据有2个属性：花萼长度($x_1$)，花萼宽度($x_2$)。通过这2个属性预测鸢尾花卉属于（Setosa，Versicolour）二个种类中的哪一类。</span>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "70d2906a",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">1) 使用pandas库将训练数据集'flower_train.csv'与测试数据集'flower_test.csv'载入到Dataframe对象中。判断训练集中每列数据是否有缺失值或者不合理的数值，如果有，请进行数据预处理。测试集为完好的数据集，不需要进行预处理。由于花卉类型(type)为字符串类型，请将训练集和测试集中花卉类型转换为适合模型训练的类型</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "f59640f4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "     x1   x2             type\n",
      "0   NaN  3.5      Iris-setosa\n",
      "1   4.9  3.0      Iris-setosa\n",
      "2   4.7  3.2      Iris-setosa\n",
      "3   4.6  3.1      Iris-setosa\n",
      "4   5.0  3.6      Iris-setosa\n",
      "..  ...  ...              ...\n",
      "75  5.7  NaN  Iris-versicolor\n",
      "76  5.7  2.9  Iris-versicolor\n",
      "77  6.2  2.9  Iris-versicolor\n",
      "78  5.1  NaN  Iris-versicolor\n",
      "79  5.7  2.8  Iris-versicolor\n",
      "\n",
      "[80 rows x 3 columns]\n",
      "x1      5\n",
      "x2      6\n",
      "type    0\n",
      "dtype: int64\n",
      "          x1        x2             type\n",
      "0   5.032432  3.500000      Iris-setosa\n",
      "1   4.900000  3.000000      Iris-setosa\n",
      "2   4.700000  3.200000      Iris-setosa\n",
      "3   4.600000  3.100000      Iris-setosa\n",
      "4   5.000000  3.600000      Iris-setosa\n",
      "..       ...       ...              ...\n",
      "75  5.700000  2.748649  Iris-versicolor\n",
      "76  5.700000  2.900000  Iris-versicolor\n",
      "77  6.200000  2.900000  Iris-versicolor\n",
      "78  5.100000  2.748649  Iris-versicolor\n",
      "79  5.700000  2.800000  Iris-versicolor\n",
      "\n",
      "[80 rows x 3 columns]\n",
      "x1      0\n",
      "x2      0\n",
      "type    0\n",
      "dtype: int64\n",
      "          x1        x2  type\n",
      "0   5.032432  3.500000     0\n",
      "1   4.900000  3.000000     0\n",
      "2   4.700000  3.200000     0\n",
      "3   4.600000  3.100000     0\n",
      "4   5.000000  3.600000     0\n",
      "..       ...       ...   ...\n",
      "75  5.700000  2.748649     1\n",
      "76  5.700000  2.900000     1\n",
      "77  6.200000  2.900000     1\n",
      "78  5.100000  2.748649     1\n",
      "79  5.700000  2.800000     1\n",
      "\n",
      "[80 rows x 3 columns]\n",
      "     x1   x2  type\n",
      "0   5.0  3.5     0\n",
      "1   4.5  2.3     0\n",
      "2   4.4  3.2     0\n",
      "3   5.0  3.6     0\n",
      "4   5.1  3.8     0\n",
      "5   4.8  3.0     0\n",
      "6   5.1  3.7     0\n",
      "7   4.6  3.2     0\n",
      "8   5.3  3.7     0\n",
      "9   5.0  3.3     0\n",
      "10  7.0  3.2     1\n",
      "11  6.4  3.2     1\n",
      "12  6.9  3.1     1\n",
      "13  5.5  2.3     1\n",
      "14  6.5  2.8     1\n",
      "15  5.7  2.8     1\n",
      "16  6.0  3.5     1\n",
      "17  4.9  2.4     1\n",
      "18  6.6  2.9     1\n",
      "19  5.2  2.7     1\n"
     ]
    }
   ],
   "source": [
    "#your code here------\n",
    "# 将训练数据集'flower_train.csv'与测试数据集'flower_test.csv'载入到Dataframe对象中。\n",
    "train_frame = pd.read_csv(\"flower_train.csv\")\n",
    "test_frame = pd.read_csv(\"flower_test.csv\")\n",
    "\n",
    "print(train_frame)\n",
    "# 判断训练集中每列数据是否有缺失值或者不合理的数值\n",
    "print(train_frame.isnull().sum())\n",
    "\n",
    "# 先将0替换为空值\n",
    "# 利用pandas中的replace函数将某一列的指定值替换为另一个值\n",
    "train_frame[['x1', 'x2']] = train_frame[['x1', 'x2']].replace(0, np.NaN)\n",
    "\n",
    "x1_column = train_frame['x1']\n",
    "x2_column = train_frame['x2']\n",
    "\n",
    "# 将空值替换为平均值\n",
    "# 按种类分类并计算每组的平均花萼长度和花萼宽度\n",
    "mean_x1_by_type = train_frame.groupby('type')['x1'].transform('mean')\n",
    "mean_x2_by_type = train_frame.groupby('type')['x2'].transform('mean')\n",
    "# 使用每个种类组的平均值来替换缺失值\n",
    "train_frame['x1'].fillna(mean_x1_by_type, inplace=True)\n",
    "train_frame['x2'].fillna(mean_x2_by_type, inplace=True)\n",
    "\n",
    "# 查看是否还有缺失值或不合理的数值\n",
    "print(train_frame)\n",
    "print(train_frame.isnull().sum())\n",
    "\n",
    "# 将标记值从字符串变成容易操作的整数类型\n",
    "train_frame['type'] = np.where(train_frame['type'] == 'Iris-setosa', 0, 1)\n",
    "test_frame['type'] = np.where(test_frame['type'] == 'Iris-setosa', 0, 1)\n",
    "print(train_frame)\n",
    "print(test_frame)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "75b2fd55",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">2)线性模型为$y=\\omega^T x$，在这里，我们将偏置量b当成模型参数$w_0$，并额外引入$x_0=1$这一特征。请相应地往测试集和训练集添加$x_0=1$这一特征。\n",
    " \n",
    "tips:上一次实验中的矩阵求解析解的方法中，需要往特征中加入一列全1的特征量，此处类似。</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "6c2c4001",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "    x0        x1        x2  type\n",
      "0    1  5.032432  3.500000     0\n",
      "1    1  4.900000  3.000000     0\n",
      "2    1  4.700000  3.200000     0\n",
      "3    1  4.600000  3.100000     0\n",
      "4    1  5.000000  3.600000     0\n",
      "..  ..       ...       ...   ...\n",
      "75   1  5.700000  2.748649     1\n",
      "76   1  5.700000  2.900000     1\n",
      "77   1  6.200000  2.900000     1\n",
      "78   1  5.100000  2.748649     1\n",
      "79   1  5.700000  2.800000     1\n",
      "\n",
      "[80 rows x 4 columns]\n",
      "    x0   x1   x2  type\n",
      "0    1  5.0  3.5     0\n",
      "1    1  4.5  2.3     0\n",
      "2    1  4.4  3.2     0\n",
      "3    1  5.0  3.6     0\n",
      "4    1  5.1  3.8     0\n",
      "5    1  4.8  3.0     0\n",
      "6    1  5.1  3.7     0\n",
      "7    1  4.6  3.2     0\n",
      "8    1  5.3  3.7     0\n",
      "9    1  5.0  3.3     0\n",
      "10   1  7.0  3.2     1\n",
      "11   1  6.4  3.2     1\n",
      "12   1  6.9  3.1     1\n",
      "13   1  5.5  2.3     1\n",
      "14   1  6.5  2.8     1\n",
      "15   1  5.7  2.8     1\n",
      "16   1  6.0  3.5     1\n",
      "17   1  4.9  2.4     1\n",
      "18   1  6.6  2.9     1\n",
      "19   1  5.2  2.7     1\n"
     ]
    }
   ],
   "source": [
    "#your code here------\n",
    "# 额外引入𝑥0=1这一特征，\n",
    "train_frame.insert(0, 'x0', 1)\n",
    "test_frame.insert(0, 'x0', 1)\n",
    "print(train_frame)\n",
    "print(test_frame)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fec5c2ed",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">3) 由于逻辑回归的原理是用逻辑函数把线性回归的结果(-∞,∞)映射到(0,1)所以逻辑回归适合于二分类问题。我们使用sigmoid函数$g(z)=\\frac{1}{1+e^{-z}}$将把线性回归的结果从(-∞,∞)映射到(0,1)。  \n",
    "    假设模型为线性回归模型$\\hat{y}=\\omega_0+\\omega_1 x_1+\\omega_2 x_2+...+\\omega_n x_n=\\omega^T x$，则在给定样本特征$x$,其标记$y=1$的概率为$$P(y=1|x,w)=\\frac{1}{1+e^{-\\omega^T x}}$$  \n",
    "    对应于任意一个样本(${x_i}$,$y_i$),其中$x_i$为特征值，$y_i$为实际标记值,在参数$\\omega$下，该样本发生的概率为$$P(y_i|x_i,\\omega)=y_i{P(y_i=1|x_i,\\omega)}+({1-y_i}){P(y_i=0|x_i,\\omega)}$$\n",
    "    将每个样本发生概率相乘，得到似然函数:$$\\Pi^m_{i=1}{P(y_i|x_i,\\omega)}$$\n",
    "    为了计算方便，一般取对数得到对数似然函数:$$L(\\omega)=\\sum^m_{i=1}{lnP(y_i|x_i,\\omega)}$$  \n",
    "    我们总是希望出现预测正确的概率的可能性最大，即想要得到极大化似然函数对应的参数$\\omega$。我们进一步将最大化似然函数就转变为最小化似然函数的负数，以方便使用梯度下降法进行求解。取负的平均对数似然函数为损失函数,通过这样构建的损失函数$$J(\\omega)=-\\frac{1}{m}\\sum^m_{i=1}{lnP(y_i|x_i,\\omega)}=-\\frac{1}{m}\\sum^m_{i=1}ln(y_i\\frac{1}{1+e^{-\\omega^T x_i}}+(1-y_i)\\frac{e^{-\\omega^T x_i}}{1+e^{-\\omega^T x_i}})$$  \n",
    "    手动实现梯度下降法(不使用机器学习框架，如PyTorch、TensorFlow等)来进行模型的训练。  </span>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f1a81c73",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">算法步骤如下：①初始化模型参数$\\omega$的值；②在负梯度的方向上更新参数(由于该实验涉及样本数量较小，建议使用批量梯度下降)，并不断迭代这一步骤。</span>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e922340",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">其中梯度的下降偏导公式为\n",
    "    $$\\frac{\\partial J}{\\partial \\omega_j}=\\frac{1}{m}\\sum_{i=1}^m x_{ij}(\\frac{e^{\\omega^T x_i}}{1+e^{\\omega^T x_i}}-y_i)$$  \n",
    "    参数更新的公式为$$\\omega_j =\\omega_j-\\eta\\frac{\\partial J}{\\partial w_j}$$其中$\\eta$表示学习率，$m$则表示批量中的样本数量，$x_{ij}$代表着第i个样本的第j个特征值,$y_i$代表着第i个样本的真实值</span>\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "8a92473b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.15799500857073695, 4.092624517144395, -7.081496974628665]\n"
     ]
    }
   ],
   "source": [
    "#your code here------\n",
    "def batch_gradient_descent(omega_init, sample, learning_rate, threshold):\n",
    "    def omega_update(omega):\n",
    "        while True:\n",
    "            omega_new = omega.copy()\n",
    "            total_list = [0] * len(sample)\n",
    "            for i in range(len(sample)):\n",
    "                wx = omega[0] * sample[i][0] + omega[1] * sample[i][1] + omega[2] * sample[i][2]\n",
    "                ewx = math.exp(wx)\n",
    "                total_list[i] = (ewx / (1 + ewx) - sample[i][3])\n",
    "            for i in range(3):\n",
    "                s = 0\n",
    "# 梯度的下降偏导公式\n",
    "                for j, total in enumerate(total_list):\n",
    "                    s += total * sample[j][i]\n",
    "# 参数更新的公式                \n",
    "                omega_new[i] = omega_new[i] - learning_rate * s / len(sample)\n",
    "# 当omega_new和omega的变化程度小于threshold时结束迭代\n",
    "            if all(abs(x - y) < threshold for x, y in zip(omega_new, omega)):\n",
    "                break\n",
    "            omega = omega_new\n",
    "        return omega\n",
    "    \n",
    "    omega = omega_update(omega_init)\n",
    "    return omega\n",
    "\n",
    "# 初始化模型参数omega的值\n",
    "omega_init = [1, 1, 1]\n",
    "# 学习率设置为0.01\n",
    "learning_rate = 0.01\n",
    "# 阈值设置为0.001\n",
    "threshold = 0.0001\n",
    "\n",
    "train = np.array(train_frame)\n",
    "omega = batch_gradient_descent(omega_init, train, learning_rate, threshold)\n",
    "print(omega)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b24f2e7e",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">4)在模型训练完成后得到所训练的模型参数$\\omega$，在测试集上进行所训练模型的测试并使用之前所介绍的损失函数计算loss值</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "a4253a5c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.1764902843715745\n"
     ]
    }
   ],
   "source": [
    "#your code here------\n",
    "test = np.array(test_frame)\n",
    "# 在模型训练完成后得到所训练的模型参数omega，在测试集上进行所训练模型的测试\n",
    "trained = [omega[0] + omega[1] * x[1] + omega[2] * x[2] for x in test]\n",
    "\n",
    "# 使用损失函数计算loss值\n",
    "loss = 0\n",
    "for i in range(len(test)):\n",
    "    loss += math.log(test[i][3] / (1 + math.exp(-trained[i])) + (1 - test[i][3]) * math.exp(-trained[i]) / (1 + math.exp(-trained[i])))\n",
    "loss = -loss / len(test)\n",
    "\n",
    "print(loss)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd339060",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">5)使用训练后的逻辑回归模型对测试数据集'flower_test.csv'进行预测，输出可视化结果（比如用seaborn或者matplotlib等可视化库来画出测试数据的散点图以及训练好的模型函数图像)，要求如下:  \n",
    "    1.将所得到的逻辑回归模型所得到的决策边界绘制出来  \n",
    "  2.测试集的所有点在同一幅图中进行绘制  \n",
    "  3.需要给不同类别的测试点不同颜色，方便通过颜色的区别直观看到预测正确和错误的样本</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "3597e87c",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAy0AAAGsCAYAAADQY0hSAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAsJUlEQVR4nO3dfXCU53kv/kvCluQXIR0ZG3DBGTVOiTFVFCCTI0wycEj9G6F6yqGTTkPquO1MM6mwMRXTY9i06bgTm5bSJv3FYwqZkHbaca3pUDs6nYlPGAe/xD5uXMCYMT2kbn0KTjDYUSxeAiKCPX8IrS2EQC+7++zu8/nMaKJ99Cx7LV4Tvr7v+7qqstlsNgAAAEpUddIFAAAAXI7QAgAAlDShBQAAKGlCCwAAUNKEFgAAoKQJLQAAQEkTWgAAgJJ2VbFf8Pz58/GjH/0o6uvro6qqqtgvDwAAlIhsNhsnTpyIm2++OaqrR19PKXpo+dGPfhSzZ88u9ssCAAAl6vDhwzFr1qxRf1700FJfXx8REU/8wn+L66YU/eUBAEiJB/+/30m6BK5goP+n8c9b7sllhNEUPTUMbQm7bspVcd2Uq4v98gAAVLhF+9fFkvWni/8XXSbsSsdG/LMEAKAi1O1aGV2bZ0SsP510KeSZ0AIAQFlr294SS3csjticdCUUitACAEBZam0fiGs+PX8wsFDRhBYAAMpO3a6VsXzzjIgdSVdCMQgtAACUje6tq2JfT6OtYCkz+gQXAAAoEa3tA+8FFlLHSgsAACWrbXtL7G2+dXArWE/S1ZAUoQUAgJKU6wpG6gktAACUlNb2gVhevcYhe3KEFgAASsLQJHu4mIP4AAAkrnvrKoGFUVlpAQAgMbnVFYfsuQyhBQCAossdsre6whgILQAAFFWmo9Mhe8ZFaAEAoCgyHZ1Jl0CZEloAACioul0ro2vzjKTLoIwJLQAAFMTQNHuBhckSWgAAyLvurasis6Mx6TKoEEILAAB5kzu3ooUxeSS0AAAwabkWxlAAQgsAABO2aP+6uP/FI7aCUVBCCwAA49baPhDLq9dcGA7ZmHQ5VLjqpAsAAKC8tG1vGQwsUCRWWgAAGJPcvBXT7CkyoQUAgMvKhZXNSVdCWtkeBgDAJbW2D5hmT0mw0gIAwAiL9q+LJetPW12hJAgtAAAMk+novNAVDEqD0AIAQES8b5o9lBihBQAgxdq2t8Tan82LfT2NSZcCoxJaAABSKnduBUqc0AIAkDLDp9lD6RNaAABSIhdWoMwILQAAKeCQPeVMaAEAqGCGQ1IJhBYAgApkOCSVpDrpAgAAyJ/W9gFdwag4VloAACpE99ZVkelp1BWMiiO0AACUudwh+55k64BCEVoAAMpU2/aWWLpjcdJlQMEJLQAAZaZte0vsbb41luoKRkoILQAAZSTT0RmxI+kqoLiEFgCAMmCaPWkmtAAAlDDti8GcFgCAktW9dZXAAmGlBQCg5ORWV7QwhogQWgAASkauhbHVFRhGaAEAKAG6gsHohBYAgATlptkDo3IQHwCgyFrbB6Ju10qBBcbISgsAQBHlzq1sTroSKB9CCwBAkTi3AhMjtAAAFJBJ9jB5zrQAABRI99ZVAgvkgZUWAIA8W7R/Xdz/4pHY19OYdClQEYQWAIA8yW0FW386IhqTLgcqhu1hAAB50La9xVYwKBArLQAAk1C3a2V0bZ6hKxgUkNACADABubBi3goUnO1hAADjMDTNvmvzjKRLgdSw0gIAMEaL9q+LJetPW12BIhNaAACuoHvrqsH2xetPJ10KpJLQAgAwilwL456kK4F0E1oAAC4yNBwyYzgklAShBQDgfXLnVgyHhJIhtAAAxOBwyKU7Fju3AiVIaAEAUi0XVgyHhJIltAAAqZXp6BRWoAwILQBA6uRaGANlQWgBAFKhtX0grt30wOAhey2MoaxUT+bJGzdujKqqqli7dm2eygEAyL+heStLHLKHsjTh0PLyyy/Htm3boqWlJZ/1AADkVffWVYMDIoGyNaHtYSdPnozPfvaz8fWvfz2+/OUv57smAIBJy3R0Dn5jKxiUvQmttKxevTo6OjriU5/61BXv7e/vj+PHjw/7AgAolLpdK98LLEBFGPdKy+OPPx579uyJl19+eUz3b9y4MR588MFxFwYAMB5t21tib/Ot0bV5RtKlAHk2rtBy+PDhuP/+++M73/lO1NXVjek5GzZsiK6urtzj48ePx+zZs8dXJQDAZZi3ApVtXKFl9+7dcezYsViwYEHu2rlz5+K5556LRx55JPr7+2PKlCnDnlNbWxu1tbX5qRYA4H1y0+yBijau0LJs2bLYv3//sGu/9Vu/FR/+8IfjgQceGBFYAAAKoW7XysFtYFZXIBXGFVrq6+tj3rx5w65dd911ccMNN4y4DgCQb4v2rxuctbI56UqAYprUcEkAgGJobR94L7AAqTOhOS3v98wzz+ShDACAS8uFFYEFUmvSoQUAoBC6t66KfT2NwgogtAAApSfT0WmSPZAjtAAAJcFwSGA0QgsApSWbjamneqNm4Eycvaoujl/XFFFVlXRVFJh5K8DlCC0AlIymviPRfORA1A6cyV3rv6ou3pg5N3obZiZYGYVkmj1wJUILACWhqe9IzDm8Z8T1moEzMefwnjgY8wWXCtLaPhDLq9ckXQZQJsxpASB52Ww0HzkQEREXbwQbetz81oGIbLaoZVEY3VtXCSzAuFhpASBxU0/1DtsSdrGqiKj92ZmYeqo3jl9/Q/EKI68W7V8X9794ZLCNMcA4CC0AJK7mMoFlIvdRWnJbwdafjojGpMsBypDQAkDizl5Vl9f7KB2Zjs6kSwAqgNACQOKOX9cU/VfVRc3AmRFnWiIishFx9uoL7Y8pC7lp9gB5ILQAkLyqqnhj5tyYc3hPZGP4Yfyho/dvzJg7tnkt5rwkqm7XysHhkKbZA3kktABQEnobZsYPf9ocP/fjN0b87Ic3NI+p3bE5L8nJDYfcnHQlQCXS8hiAktDUd+SSgSUi4ud+/EY09R254vPnHN4z4rD+0JyXKz2fiWltH4i6XStNswcKykoLAMm7wpyWbAzOaemdOuPSW70m+3wmJHfI3uoKUGBCCwCJm+ycFnNeiss0e6DYhBYAEjfZOS3mvBTH0HDIjK5gQJEJLQAkbrJzWsx5Kay27S2x9mfzImM4JJAQoQWAxE12Tos5L4WT6woGkCChBYDiGmWOyqTmtORzzgsR8b6wsiPpSgCEFgCK6EpzVA7G/BE/P3t1Xbwx48pzVib7fAblhkMKK0AJEVoAKIqhOSoXG5qjcjDmR2/DzOidOmPCE+0n+/y06966KvZtbky6DIARhBYACm+cc1Qm1ZZ4ss9Pmdb2gbh20wOxZP3piJ6kqwG4NKEFgIIzR6U05eatrD+ddCkAl1WddAEAVD5zVEpP99ZVBkQCZcNKCwAFZ45KaRg2yd5WMKCMCC0AFJw5KpN3Llsd+07dHj8e+C9xw1U/iY9c91pMqTo/5ufX7VoZyzfPKGCFAIUjtABQeOaoTMozfW3x1SOfj7cHbsxdu/Gqt2PtzG2xpOF/X/a5Q9PsdQUDypnQAgAl7Jm+tvji4cyI628P3BBfPJyJh+LhUYNLpqPTvBWgIggtABTeOFseM+hctjq+euTzFx5d/PtSHRHn4y/f+nx8Yuo/D9sqlunoLFaJAEWhexgABTfU8ni0OPL+lse8Z9+p2y9sCRvtd646jv3sxth36vaIGDy3IrAAlchKCwAFp+XxxPx44L+M6b4bHv6NyOz7z4jNBS4IICFWWgAoOC2PJ+aGq34ypvv+9H+eKHAlAMkSWgAouKGWx9lRfp6NiH4tj0f4yHWvxY1XvR0Ro7U2zkZt/elomGVbHVDZhBYACu9Cy+OIGBFctDwe3ZSq87F25rYYPNNycXAZ/J374LIDUeX/zYEK5485AIqit2FmHJw9f8QWsLNX18XB2fOjt2FmQpWVtiUN/zsemv1w/Nz1Px52vbb+TMxdsSdunHM0ocoAisdBfACKprdhZvROnRFTT/VGzcCZOHvVhS1hVlhG1ba9JeqaZ8b/2vRyTHuzKc6erI2a6/ujYVavFRYgNYQWAIqrqiqOX39D0lWUhe6tqyKzozEiIqqqIxpvcXYFSCehBQBKUKajM6In6SoASoPQAgAlom17SyzdsTjpMgBKjtACAAlbtH9d3P/ikdxWMACGE1oAIEGL9q+LJetPR0Rj0qUAlCyhBYDiymZ1D4uI1vaBWF69JmL96aRLASh5QgsARdPUdySajxyI2oEzuWv9V9XFGzPnpmpOS6ajM+kSAMqK0AJAUTT1HYk5h/eMuF4zcCbmHN4TB6PyB0x2b10V+3oaky4DoOwILQAUXjYbzUcORETExRvBqiIiGxHNbx2I3qkzKnKrWN2uldG1eYYWxgATJLQAUHBTT/UO2xJ2saqIqP3ZmZh6qreiBk/mDtlvTroSgPJWnXQBAFS+mssEloncV+pa2weibtfKC13BAJgsKy0AFNzZq+ryel8pyx2yt7oCkDdCC0C5mkzr4CK3HT5+XVP0X1UXNQNnRpxpiRg803L26gt1lKlK7giWPR/R92ZTnD1ZGzXX90fDrN6oKpO9GuVcO/AeoQWgDE2mdXAibYerquKNmXNjzuE9kY3hh/GzF/73jRlzy/IQ/nvDISvT2wenx+tPz42zJ67JXaupPx23LjsQN845mmBlV1bOtQPD+W8NAGVmqHXwxec/hloHN/UdKchzJ6u3YWYcnD1/xBaws1fXxcHZ5dfuuG17S2Q6Ois+sBx4cn6cPXHRP7MTdXHgyfnx9sHpCVV2ZeVcOzCSlRaAcjKZ1sEl0Ha4t2Fm9E6dUdStaYXQtr0llu5YnHQZBZU9H/H603MvPLr0J+bfn54b0z50tOS2W5Vz7cClCS0AZWQyrYNLpu1wVVXZtjXObQXbkXQlhdf3ZtOwbVUjVUX/iWui782maLylt2h1jUU51w5cmtACUEYm0zo4bW2H8yk3HLKCt4Jd7OzJ2rzeV0zlXDtwaUILQBmZTOvgNLUdzqfurati3+bGpMsouprr+/N6XzGVc+3ApQktAGVkMq2D09B2OF9a2wfi2k0PDG4F60m6mmQ0zOqNmvrTFw6yX/oTU1t/Jhpmld72qnKuHbg0x88AysmF1sER77UKHnLF1sGTeW6K1O1aGcur11R0V7CxqKqOuHXZgQuPLv2J+eCyA4kfZM+ej3j3UFMcOzAz3j3UFNnz5VM7MHZWWgDKTG/DzDgY80fMWjl7dV28MePys1Ym89w0yHR0mmT/PjfOORpzV+wZMeuktv5MfLAEZp1caQ5LKdcOjE9VNpu9+D9BFNTx48ejoaEhvnPbnXHdlKuL+dIAlWUyU+0n89wK09o+EMur1yRdRkkrxanyQ3NYBo0cVzp3xZ64cc7RkqwdeM9A/0/jha9+Ovr6+mLq1Kmj3melBaBcTaZ1cBm3Hc6Xtu0tsbf51li+eUbSpZS8quooqdbA453DUkq1AxMjtACQOmkYDlnJzGGB9BFaAEiN3FawFAyHrGTmsED6CC0ApEKmozPpEsgTc1ggfYQWACpabpo9FcMcFkgfoQWA8SuD7mOL9q8bnLWihXHFGZrDMtg9LBuX6h5mDgtUlnH967xly5ZoaWmJqVOnxtSpU6OtrS2+/e1vF6o2AEpQU9+RWHDwuzHv/74Uv/DmKzHv/74UCw5+N5r6jiRdWkQMnlvJdHSmfjhkpRuaw1JTf2bY9dr6M7l2x0DlGNdKy6xZs+JP/uRP4tZbb42IiL/5m7+JX/mVX4m9e/fG7bffXpACASgdTX1HYs7hPSOu1wyciTmH98TBmJ/ogMq6XSu1ME6RG+ccjWkfOmoOC6TAuELLXXfdNezxQw89FFu2bImXXnpJaAGodNlsNB85EBGjTcaIaH7rQPROnVH0rWK5Q/a2gqWOOSyQDhM+03Lu3Ln4h3/4hzh16lS0tbWNel9/f3/097/XveP48eMTfUkAEjT1VG/UDpwZ9edVEVH7szMx9VRv0QZXOmQPkA7jDi379++Ptra2OHPmTFx//fXxxBNPxNy5c0e9f+PGjfHggw9OqkgAkldzmcAykfsmY2iavcACkA7j3vU5Z86ceOWVV+Kll16K3/3d34177rknDhw4MOr9GzZsiL6+vtzX4cOHJ1UwAMk4e1VdXu+bqO6tq2LpjsUCC0CKjHulpaamJncQf+HChfHyyy/HX/7lX8bWrVsveX9tbW3U1ppIC1Dujl/XFP1X1UXNwJlRJmNEnL36QvvjAsl0dEb0FOyXB6BETXpOSzabHXZmBYAKVVUVb8ycG3MO7xllMkbEGzPm5v0Qftv2lli6Y3Fef02KK3s+dPgCJmVcoSWTyUR7e3vMnj07Tpw4EY8//ng888wz8dRTTxWqPgBKSG/DzDgY86P5yIFhh/LPXl0Xb8yYm9d2x4v2r4v7XzwSmR2Nefs1Kb63D06P15+eG2dPXJO7VlN/Om5ddsAsFWDMxhVajh49GnfffXccOXIkGhoaoqWlJZ566qn4pV/6pULVB0CJ6W2YGb1TZ8TUU71RM3Amzl51YUtYHldYctPsozFvvybF9/bB6Rem1g939kRdHHhyviGQwJiNK7R84xvfKFQdAJSTqqqCtDXObQUzzb7sZc9HvP70UHfRS0/2+fen58a0Dx21VQy4okmfaQGAycqtrOxIuhLype/NpmFbwkaqiv4T10Tfm02GQwJXJLQAkKhMR6eVlQp09uTYOoeO9T4g3YQWAIqutX0gNqz4XOzraUy6FAqk5vqxdRYd631AugktABRNa/tAXLvpgcGtYBU4byWtrX0v9b4bZvVGTf3pOHuiLkaeaYmIyEZt/ZlomGVrGHBlQgsARdHaPhDLq9dU7FawtLb2vdz7vnXZgQvdwy492eeDyw6kItQBk+ePCgAKLtPRORhYKtRQa9/BVYX3DLX2ffvg9IQqK6wrve+IiLkr9kRN/ZlhP6+tP6PdMTAuVloAKJhMR2fSJRRcWlv7jvV9f/wLu2Lah46mctsckD9CCwB5V7drZXRtnpF0GUWR1ta+433flfTegeITWgDIm9xwyM1JV1I8aW3tm9b3DSRDaAEgLzIdnakcDpnW1r5pfd9AMoQWACYlN80+pdLa2jet7xtIhmNwAExI99ZVkenoLHpgyZ6PePdQUxw7MDPePdQU2fNFffkRqqojbl124MKj7EU/rdzWvml930AyrLQAMC65lZUEhkOW6iyUG+ccjbkr9oyorbb+THywgue0pPV9A8UntAAwJsOm2SdgaCbIxYZmgiQ99+PGOUdT2do3re8bKC6hBYArqtu1MpZvnpHYNPtymYVSVR2pbO2b1vcNFI/QAsBlZTo6E29hnNZZKAAMEloAGCE3b6VEmAkCkG5CCwA5bdtbYm/zrbG0xKbZmwkCkG5CCwARUXqrK+9nJghAugktACnX2j4Qy6vXlPQ0+6GZIIPdw7IxPLiYCQJQ6YQWgJQqt0n2ZoIApJfQApBC3VtXRaaMAssQM0EA0kloAUiRJKfZ54uZIADpI7QApEDu3EoZrq4AgAV1gApXt2vlYGABgDJlpQWgQmU6Oge/SXiaPQBMltACUGHqdq2MrhIbDgkAkyG0AFSIoWn2AgsAlUZoAShzre0DsWHF5yKzozHpUgCgIIQWgDKWO7dSxi2MAeBKhBaAMlRu0+wBYDKEFoAysmj/urj/xSNlOc0eACZKaAEoA23bW2LpjsUXhkM2Jl0OABSV4ZIAJS4XWAAgpay0AJSo3LyVHUlXMn7Z8xF9bzbF2ZO1UXN9fzTM6o0q/5kMgAkSWgBKUKajs2wn2b99cHq8/vTcOHvimty1mvrTceuyA3HjnKMJVgZAuRJaAErE0LyVfT2NSZcyYW8fnB4Hnpw/4vrZE3Vx4Mn5MXfFHsEFgHGzWA+QsNb2gVi0f10sr15T1oElez7i9afnXnhUddFPBx//+9NzI3u+qGUBUAGstAAkqLV9IJZXr7nQFay89b3ZNGxL2EhV0X/imuh7sykab+ktWl0AlD+hBSAhuWn2FeLsydq83gcAQ4QWgCKq5PbFNdf35/U+ABjiTAtAkdTtWlmxgSUiomFWb9TUn46I7Ch3ZKO2/nQ0zLI1DIDxsdICUGC51ZUitzCuPn8uWt58LZpO9kbv9U3x6qzb43z1lIK9XlV1xK3LDlzoHpaN4YfxB4PMB5cdKOl5LebLAJQmoQWggDIdnYkMh/zEwRfivqe3xU0n3sldO1Y/Lb627PPx/Jw7Cva6N845GnNX7Bkxp6W2/kx8sMTntJgvA1C6qrLZ7Gjr+AVx/PjxaGhoiO/cdmdcN+XqYr40QNHkptkn4BMHX4g/fvLhiBi+1nH+wuMvrcgUNLhElN+KxfD5MiNXiMyXASiMgf6fxgtf/XT09fXF1KlTR73PSgtAHnVvXTU4ayWhafbV58/FfU9vi4iRk1KqYzC43Pv0tnjhQ/+14FvFyqWt8ZXny2Tj35+eG9M+dLSkgxdAJfPHL0AeLNq/LjIdnYkPh2x587W46cQ7I/7qPaQ6IqafeCda3nytmGWVtPfmy4z2u/befBkAkiG0AEzC0DT7JSUyHLLp5NhWN8Z6XxqYLwNQ+mwPA5ig7q2rItPTWFLT7HuvH9tqwFjvSwPzZQBKn9ACME65SfY9ydZxKa/Ouj2O1U+LaSfeueRS+vmIeLt+Wrw66/Zil1ayhubLnD1RF5feIpaN2voz5ssAJMj2MIAxatve8l5gKVHnq6fE15Z9PqpiMKAM+1kM/pX8kWWfL+gh/HIzNF9m0MUNNctjvgxApfNHMMAVtG1vKatp9s/PuSO+tCIT79RPG3b97fppRWl3XI6G5svU1J8Zdr22/ox2xwAlwPYwgMtIajjkZD0/54544UP/NVrefC2aTvZG7/VN8eqs262wXMaNc47GtA8dLav5MgBpIbQAXEJr+0Asr16TdBmTcr56SrxyS0vSZZSVcpovA5AmQgvA+5RS+2IAYJBFb4ALureuElgAoARZaQFSrbV9IK7d9MBgWCnBFsYAgNACpFhuK5jVFQAoaUILkEqZjk5hBQDKhNACpEqpD4cEAEYSWoCK17a9JfY23xpdm2ckXQoAMAFCC1DRtDAGgPIntAAVKTccUmABgLIntAAVx7kVAKgsQgtQMWwFA4DKJLQAZa9u18rBQ/YCCwBUpOrx3Lxx48b42Mc+FvX19XHTTTfFihUr4uDBg4WqDeCy2ra3RKajU1cwAKhw4wotzz77bKxevTpeeuml2LlzZwwMDMSdd94Zp06dKlR9AJfUtr0llu5YnHQZAEARjGt72FNPPTXs8Te/+c246aabYvfu3fHJT34yr4UBXEpuK9iOpCsBAIplUmda+vr6IiKiqalp1Hv6+/ujv78/9/j48eOTeUkgpbq3rop9PY0Rm5OuBAAotnFtD3u/bDYbXV1dsXjx4pg3b96o923cuDEaGhpyX7Nnz57oSwIp1La95b3AAgCk0oRDy7333huvvvpq/P3f//1l79uwYUP09fXlvg4fPjzRlwRSpLV9IOp2rYylOxYLLACQchPaHnbfffdFT09PPPfcczFr1qzL3ltbWxu1tbUTKg5Ip9w0e1vBAIAYZ2jJZrNx3333xRNPPBHPPPNMNDc3F6ouIIVyYQUA4H3GFVpWr14djz32WHzrW9+K+vr6eOuttyIioqGhIa655pqCFAhUPpPsAYDLGdeZli1btkRfX18sWbIkZs6cmfvq7u4uVH1AhavbtVJgAQAua9zbwwDyITcc0rkVAOAKJjWnBWC8cmHFcEgAYIyEFqBoMh2dwgoAMG5CC1BwhkMCAJMhtAAF0do+ENduemDwkH1P0tUAAOVMaAHyLjdvRVcwACAPxtXyGOBKureuMiASAMgrKy1AXmQ6Oge/sRUMAMgzoQWYFNPsAYBCE1qACWnb3hJrfzYvMgILAFBgQgswbuatAADFJLQAY5abZg8AUERCC3BFdbtWRtfmGVZXAIBECC3AqHKH7DcnXQkAkGbmtAAjtLYP6AoGAJQMKy3AMLmwIrAAACVCaAFyMh2dwgoAUHKEFuC9afYAACVIaIGUatveEnubbx3sCgYAUMKEFkghh+wBgHIitECKtLYPxPLqNc6tAABlRWiBFDDJHgAoZ0ILVLhMR6dJ9gBAWRNaoELV7VrpkD0AUBGEFqgwua1gm5OuBAAgP4QWqCC2ggEAlUhogQrQvXVV7OtpTLoMAICCEFqgjOXCSk/SlQAAFI7QAmWobXtLrP3ZPKsrAEAqCC1QRlrbB+L//I9fi6W6ggEAKSK0QJnIdHQOfqMrGACQMkILlLjW9oFYXr0m6TIAABIjtECJWrR/XSxZfzrpMgAAEie0QIkZOmSfEVgAACJCaIGSkptmDwBAjtACJSC3Fcw0ewCAEYQWSFimozPCVjAAgFEJLZCQ3DR7AAAuS2iBImptH4hrNz0wuBWsJ+lqAADKg9ACRZKbt2IrGADAuAgtUAS5afYAAIyb0AIFYpI9AEB+VCddAFSiul0rBRYAgDyx0gJ5NDTNft/mxqRLAQCoGEIL5EFuK5jhkAAAeSe0wCQ5ZA8AUFhCC0xQ3a6V0bV5RtJlAABUPKEFxmnR/nWDwyE3J10JAEA66B4GY9TaPhB1u1YOBhYAAIrGSguMQd2ulbF88wyrKwAACRBa4DK6t66KfT2NwgoAQIKEFriEXAvjnqQrAQBAaIH3GRoOmelpTLoUAAAuEFogBldWNqz4XGR2NCZdCgAAFxFaSD1bwQAASpvQQmq1bW+JpTsWJ10GAABXILSQSpmOzogdSVcBAMBYCC2kSt2uldG1eUbSZQAAMA5CC6mQ2wpm3goAQNkRWqh4toIBAJQ3oYWKlZtmDwBAWRNaqChD81b29TRqYQwAUCGEFirGov3rYsn608IKAECFqU66AJis1vaB6N66ajCwAABQcay0UNYyHZ2D31hdAQCoWEILZck0ewCA9BBaKCu54ZBaGAMApMa4z7Q899xzcdddd8XNN98cVVVV8eSTTxagLBiubXtLZDo6TbMHAEihcYeWU6dOxUc+8pF45JFHClEPjGArGABAuo17e1h7e3u0t7cXohYYJtfC2FYwAIBUK/iZlv7+/ujv7889Pn78eKFfkjKXm2SvhTEAAFGEOS0bN26MhoaG3Nfs2bML/ZKUsVxgAQCACwoeWjZs2BB9fX25r8OHDxf6JSkzre0DsWj/ush0dAosAACMUPDtYbW1tVFbW1vol6FM1e1aGcs3z7AVDACAUZnTQmIyHZ0Rm5OuAgCAUjfu0HLy5Ml4/fXXc4/feOONeOWVV6KpqSluueWWvBZH5dG+GACA8Rp3aPmXf/mXWLp0ae5xV1dXRETcc8898dd//dd5K4zKU7drZSw1HBIAgHEad2hZsmRJZLPZQtRChcqtrtgKBgDABDjTQsG0tg/E8uo1hkMCADApQgsFkenoTLoEAAAqhNBCXhkOCQBAvgkt5EXdrpXRtXlGRE/SlQAAUGmEFibFIXsAAApNaGHCtDAGAKAYhBbGLXfI3uoKAABFILQwZrkWxgAAUERCC1e0aP+6uP/FI5HRFQwAgAQILYyqtX0gNqz4XGTWn46IxqTLAQAgpYQWLim3FUwLYwAAEia0MMyi/etiyfrTSZcBAAA5QgsR8b7hkAILAAAlpjrpAkheLrAAAEAJstKSYqbZAwBQDoSWFMqtrOxIuhIAALgyoSVlMh2dVlYAACgrQksKDM1b2Wc4JAAAZUhoqXC5FsbmrQAAUKZ0D6tQre0D0b11lZkrAACUPSstFSjT0Tn4jdUVAAAqgNBSQXJhBQAAKojQUgEMhwQAoJIJLWXMcEgAANJAaClTmY5OwyEBAEgFoaXM2AoGAEDaCC1lonvrqsHhkLaCAQCQMua0lLiheSum2QMAkFZWWkpUa/tAXLvpAdPsAQBIPaGlBHVvXRWZnsYI0+wBAEBoKTWZjk4rKwAA8D5CSwnIzVsBAABGEFoStGj/utjzzhuxVAtjAAAYldCSkEX71w0esg+BBQAALkdoKbLW9oFYXr3GIXsAABgjoaVI3ltZAQAAxkNoKYJMR6eVFQAAmCChpYDqdq2MLofsAQBgUoSWAsi1MN6cdCUAAFD+hJY8q9u1UgtjAADII6ElTzIdnYPfWF0BAIC8ElomqXvrqtjX05h0GQAAULGElglatH9d3P/iEYEFAAAKTGgZp+HDIRuTLgcAACqe0DIOuXMrAABA0QgtY2CaPQAAJEdouYzccEiBBQAAElOddAGlyjR7AAAoDVZaLmKaPQAAlBah5YLcysqOpCsBAADeT2iJC13BrKwAAEBJSm1oaW0fiA0rPmc4JAAAlLjUHcRvbR+Iul0rY3n1GoEFAADKQKpWWnLT7G0FAwCAspGK0JILKwAAQNmp+NCS6ehMugQAAGASKja0GA4JAACVoeJCi+GQAABQWSomtOTOrRgOCQAAFaUiWh4PtTAGAAAqT1mvtHRvXTU4a8VWMAAAqFhlGVoW7V8XS9afjuhJuhIAAKDQyiq0tG1vib3Ntw4GFgAAIBXKJrR0b10VmR2NSZcBAAAUWcmHltxwSFvBAAAglUo2tOTmrQAAAKlWcqFl0f51cf+LR2wFAwAAImKCc1oeffTRaG5ujrq6uliwYEE8//zzeSlmqCvYvp7GvPx6AABA+Rt3aOnu7o61a9fGF7/4xdi7d2984hOfiPb29jh06NCEi2jb3hKZjk5dwQAAgBGqstlsdjxP+PjHPx7z58+PLVu25K7ddtttsWLFiti4ceOI+/v7+6O/vz/3uK+vL2655ZZ44hf+WzQ9tTIy///0SZQPAACUq4H+n8Y/b7kn3n333WhoaBj1vnGdaTl79mzs3r071q9fP+z6nXfeGS+++OIln7Nx48Z48MEHR1z/7z/4bsTPf3c8Lw8AAFSgEydO5C+0vPPOO3Hu3LmYPn346sj06dPjrbfeuuRzNmzYEF1dXbnH7777bnzgAx+IQ4cOXbYwmKzjx4/H7Nmz4/DhwzF16tSky6GC+axRLD5rFIvPGsWSzWbjxIkTcfPNN1/2vgl1D6uqqhrxYhdfG1JbWxu1tbUjrjc0NPiXgKKYOnWqzxpF4bNGsfisUSw+axTDWBYyxnUQf9q0aTFlypQRqyrHjh0bsfoCAACQD+MKLTU1NbFgwYLYuXPnsOs7d+6MRYsW5bUwAACAiAlsD+vq6oq77747Fi5cGG1tbbFt27Y4dOhQfOELXxjT82tra+OP/uiPLrllDPLJZ41i8VmjWHzWKBafNUrNuFseRwwOl9y0aVMcOXIk5s2bF1/5ylfik5/8ZCHqAwAAUm5CoQUAAKBYxnWmBQAAoNiEFgAAoKQJLQAAQEkTWgAAgJJW1NDy6KOPRnNzc9TV1cWCBQvi+eefL+bLkwIbN26Mj33sY1FfXx833XRTrFixIg4ePJh0WaTAxo0bo6qqKtauXZt0KVSoH/7wh/Ebv/EbccMNN8S1114bra2tsXv37qTLosIMDAzEH/zBH0Rzc3Ncc8018fM///Pxx3/8x3H+/PmkSyPlihZauru7Y+3atfHFL34x9u7dG5/4xCeivb09Dh06VKwSSIFnn302Vq9eHS+99FLs3LkzBgYG4s4774xTp04lXRoV7OWXX45t27ZFS0tL0qVQoX7yk5/EHXfcEVdffXV8+9vfjgMHDsSf//mfR2NjY9KlUWH+9E//NP7qr/4qHnnkkfjXf/3X2LRpU/zZn/1ZfO1rX0u6NFKuaC2PP/7xj8f8+fNjy5YtuWu33XZbrFixIjZu3FiMEkiht99+O2666aZ49tlnzRKiIE6ePBnz58+PRx99NL785S9Ha2trfPWrX026LCrM+vXr44UXXrBDgYL75V/+5Zg+fXp84xvfyF371V/91bj22mvjb//2bxOsjLQrykrL2bNnY/fu3XHnnXcOu37nnXfGiy++WIwSSKm+vr6IiGhqakq4EirV6tWro6OjIz71qU8lXQoVrKenJxYuXBif/vSn46abboqPfvSj8fWvfz3psqhAixcvjqeffjp+8IMfRETEvn374nvf+14sX7484cpIu6uK8SLvvPNOnDt3LqZPnz7s+vTp0+Ott94qRgmkUDabja6urli8eHHMmzcv6XKoQI8//njs2bMnXn755aRLocL9x3/8R2zZsiW6uroik8nE97///VizZk3U1tbG5z73uaTLo4I88MAD0dfXFx/+8IdjypQpce7cuXjooYfiM5/5TNKlkXJFCS1Dqqqqhj3OZrMjrkG+3HvvvfHqq6/G9773vaRLoQIdPnw47r///vjOd74TdXV1SZdDhTt//nwsXLgwHn744YiI+OhHPxqvvfZabNmyRWghr7q7u+Pv/u7v4rHHHovbb789XnnllVi7dm3cfPPNcc899yRdHilWlNAybdq0mDJlyohVlWPHjo1YfYF8uO+++6Knpyeee+65mDVrVtLlUIF2794dx44diwULFuSunTt3Lp577rl45JFHor+/P6ZMmZJghVSSmTNnxty5c4ddu+2222LHjh0JVUSl+v3f//1Yv359/Pqv/3pERPziL/5i/Od//mds3LhRaCFRRTnTUlNTEwsWLIidO3cOu75z585YtGhRMUogJbLZbNx7773xj//4j/Hd7343mpubky6JCrVs2bLYv39/vPLKK7mvhQsXxmc/+9l45ZVXBBby6o477hjRvv0HP/hBfOADH0ioIirVT3/606iuHv7XwylTpmh5TOKKtj2sq6sr7r777li4cGG0tbXFtm3b4tChQ/GFL3yhWCWQAqtXr47HHnssvvWtb0V9fX1uda+hoSGuueaahKujktTX1484K3XdddfFDTfc4AwVefd7v/d7sWjRonj44Yfj137t1+L73/9+bNu2LbZt25Z0aVSYu+66Kx566KG45ZZb4vbbb4+9e/fGX/zFX8Rv//ZvJ10aKVe0lscRg8MlN23aFEeOHIl58+bFV77yFW1oyavRzkh985vfjN/8zd8sbjGkzpIlS7Q8pmD+6Z/+KTZs2BD/9m//Fs3NzdHV1RW/8zu/k3RZVJgTJ07EH/7hH8YTTzwRx44di5tvvjk+85nPxJe+9KWoqalJujxSrKihBQAAYLyKcqYFAABgooQWAACgpAktAABASRNaAACAkia0AAAAJU1oAQAASprQAgAAlDShBQAAKGlCCwAAUNKEFgAAoKQJLQAAQEn7f0XOLrfSg5WwAAAAAElFTkSuQmCC",
      "text/plain": [
       "<Figure size 1000x500 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "#your code here------\n",
    "#确定图画边界和大小\n",
    "plt.figure(figsize=(10,5))\n",
    "x_min, x_max = 0,10\n",
    "y_min, y_max = 0,5\n",
    "#使用numpy中的meshgrid生成网格矩阵，方便进行之后的描点\n",
    "boundary_x, boundary_y = np.meshgrid(np.arange(x_min, x_max, 0.01),np.arange(y_min, y_max, 0.01))\n",
    "grid = np.c_[boundary_x.ravel(), boundary_y.ravel()]\n",
    "#加入偏置(或w_0)对应的特征为1的一列\n",
    "e=np.ones((len(grid),1))\n",
    "grid=np.c_[e,grid]\n",
    "#假定下列的模型参数\n",
    "w=np.array([[omega[0]],[omega[1]],[omega[2]]])\n",
    "#计算出网格点中每个点对应的逻辑回归预测值\n",
    "z=grid.dot(w)\n",
    "for i in range(len(z)):\n",
    "    z[i][0] = (1 / (1 + np.exp(-z[i][0])))\n",
    "    if z[i][0]<0.5:\n",
    "        z[i][0]=0\n",
    "    else:\n",
    "        z[i][0]=1\n",
    "#转换shape将所得到的逻辑回归模型所得到的决策边界绘制出来\n",
    "z=z.reshape(boundary_x.shape)\n",
    "plt.contourf(boundary_x, boundary_y, z, cmap=plt.cm.Spectral, zorder=1)\n",
    "\n",
    "# 测试集的所有点在同一幅图中进行绘制\n",
    "class_1 = test_frame[test_frame['type'] == 1]\n",
    "class_0 = test_frame[test_frame['type'] == 0]\n",
    "# 给type == 1的测试点蓝色，给type == 0的测试点红色，通过颜色的区别直观看到预测正确和错误的样本\n",
    "plt.scatter(class_1['x1'],class_1['x2'],c='blue')\n",
    "plt.scatter(class_0['x1'],class_0['x2'],c='red')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b869bee1",
   "metadata": {},
   "source": [
    "**<font color = blue size=4>第三部分:逻辑回归实验二</font>**"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cf3a5a55",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">该数据集(train_titanic.csv和test_titanic.csv)同样为分类数据集，为泰坦尼克号的乘客信息以及最后是否生还。每个包括了七个特征值以及标记(代表是否生还),特征信息分别为Passengerid(乘客id)，Age(乘客年龄)，Fare(船票价格),Sex(性别)，sibsp(堂兄弟妹个数)，Parch(父母与小孩的个数)，Pclass(乘客等级)</span>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f4a492fe",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">该数据集已经做了处理，无缺失值和空值，且字符串类型全部转换成了整数类型，你们需要进行判断，在七个特征值至少选择四个你认为与最后是否生还关联度高的特征类别。该实验的任务依然是在训练集上使用逻辑回归方法和手动实现的梯度下降方法完成模型训练。</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "7ebd3c1e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 导入机器学习常用的库\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import math"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "915ff106",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">1) 使用pandas库将训练数据集'train_titanic.csv'与测试数据集'test_titanic.csv'载入到Dataframe对象中</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "7056ffde",
   "metadata": {},
   "outputs": [],
   "source": [
    "#your code here------\n",
    "# 使用pandas库将训练集'train_titanic.csv'与测试数据集'test_titanic.csv'载入到Dataframe对象中\n",
    "train_frame = pd.read_csv(\"train_titanic.csv\")\n",
    "test_frame = pd.read_csv(\"test_titanic.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9abde442",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">2)利用上个实验所使用的梯度下降方法(由于该数据集样本数量较大，所以建议使用随机批量或小批量)进行模型训练</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "55e62317",
   "metadata": {},
   "outputs": [
    {
     "ename": "OverflowError",
     "evalue": "math range error",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mOverflowError\u001b[0m                             Traceback (most recent call last)",
      "Cell \u001b[1;32mIn[3], line 50\u001b[0m\n\u001b[0;32m     47\u001b[0m threshold \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m0.0001\u001b[39m\n\u001b[0;32m     49\u001b[0m train \u001b[38;5;241m=\u001b[39m np\u001b[38;5;241m.\u001b[39marray(train_frame)\n\u001b[1;32m---> 50\u001b[0m omega \u001b[38;5;241m=\u001b[39m batch_gradient_descent(omega_init, train, learning_rate, threshold)\n\u001b[0;32m     51\u001b[0m \u001b[38;5;28mprint\u001b[39m(omega)\n",
      "Cell \u001b[1;32mIn[3], line 41\u001b[0m, in \u001b[0;36mbatch_gradient_descent\u001b[1;34m(omega_init, sample, learning_rate, threshold)\u001b[0m\n\u001b[0;32m     38\u001b[0m         omega \u001b[38;5;241m=\u001b[39m omega_new\n\u001b[0;32m     39\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m omega\n\u001b[1;32m---> 41\u001b[0m omega \u001b[38;5;241m=\u001b[39m omega_update(omega_init)\n\u001b[0;32m     42\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m omega\n",
      "Cell \u001b[1;32mIn[3], line 23\u001b[0m, in \u001b[0;36mbatch_gradient_descent.<locals>.omega_update\u001b[1;34m(omega)\u001b[0m\n\u001b[0;32m     15\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m i \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mrange\u001b[39m(\u001b[38;5;28mlen\u001b[39m(sample)):\n\u001b[0;32m     16\u001b[0m     wx \u001b[38;5;241m=\u001b[39m (\n\u001b[0;32m     17\u001b[0m         omega[\u001b[38;5;241m0\u001b[39m] \u001b[38;5;241m*\u001b[39m sample[i][\u001b[38;5;241m0\u001b[39m]\n\u001b[0;32m     18\u001b[0m         \u001b[38;5;241m+\u001b[39m omega[\u001b[38;5;241m1\u001b[39m] \u001b[38;5;241m*\u001b[39m sample[i][\u001b[38;5;241m1\u001b[39m]\n\u001b[1;32m   (...)\u001b[0m\n\u001b[0;32m     21\u001b[0m         \u001b[38;5;241m+\u001b[39m omega[\u001b[38;5;241m4\u001b[39m] \u001b[38;5;241m*\u001b[39m sample[i][\u001b[38;5;241m4\u001b[39m]\n\u001b[0;32m     22\u001b[0m     )\n\u001b[1;32m---> 23\u001b[0m     ewx \u001b[38;5;241m=\u001b[39m math\u001b[38;5;241m.\u001b[39mexp(wx)\n\u001b[0;32m     24\u001b[0m     \u001b[38;5;66;03m# print(ewx)\u001b[39;00m\n\u001b[0;32m     25\u001b[0m     total_list[i] \u001b[38;5;241m=\u001b[39m ewx \u001b[38;5;241m/\u001b[39m (\u001b[38;5;241m1\u001b[39m \u001b[38;5;241m+\u001b[39m ewx) \u001b[38;5;241m-\u001b[39m sample[i][\u001b[38;5;241m5\u001b[39m]\n",
      "\u001b[1;31mOverflowError\u001b[0m: math range error"
     ]
    }
   ],
   "source": [
    "#your code here------\n",
    "# 相应地往测试集和训练集添加𝑥0=1这一特征\n",
    "train_frame.insert(0, \"x0\", 1)\n",
    "test_frame.insert(0, \"x0\", 1)\n",
    "\n",
    "# 批量梯度下降\n",
    "def batch_gradient_descent(omega_init, sample, learning_rate, threshold):\n",
    "\n",
    "    def omega_update(omega):\n",
    "        while True:\n",
    "            omega_new = omega.copy()\n",
    "            # print(omega_new)\n",
    "            total_list = [0] * len(sample)\n",
    "            # print(\"hahaha\")\n",
    "            for i in range(len(sample)):\n",
    "                wx = (\n",
    "                    omega[0] * sample[i][0]\n",
    "                    + omega[1] * sample[i][1]\n",
    "                    + omega[2] * sample[i][2]\n",
    "                    + omega[3] * sample[i][3]\n",
    "                    + omega[4] * sample[i][4]\n",
    "                )\n",
    "                ewx = math.exp(wx)\n",
    "                # print(ewx)\n",
    "                total_list[i] = ewx / (1 + ewx) - sample[i][5]\n",
    "            # print(total_list)\n",
    "            for i in range(5):\n",
    "                s = 0\n",
    "# 梯度的下降偏导公式\n",
    "                for j, total in enumerate(total_list):\n",
    "                    s += total * sample[j][i]\n",
    "# 参数更新的公式\n",
    "                omega_new[i] = omega_new[i] - learning_rate * s / len(sample)\n",
    "            if all(abs(x - y) < threshold for x, y in zip(omega_new, omega)):\n",
    "                break\n",
    "            omega = omega_new\n",
    "        return omega\n",
    "\n",
    "    omega = omega_update(omega_init)\n",
    "    return omega\n",
    "\n",
    "# 初始化模型参数omega的值\n",
    "omega_init = [1, 1, 1, 1, 1]\n",
    "# 学习率设置为0.01\n",
    "learning_rate = 0.01\n",
    "# 阈值设置为0.001\n",
    "threshold = 0.0001\n",
    "\n",
    "train = np.array(train_frame)\n",
    "omega = batch_gradient_descent(omega_init, train, learning_rate, threshold)\n",
    "print(omega)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "95f698bc",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">3) 使用训练后的逻辑回归模型对测试数据集'test_titanic.csv'进行预测，并计算其loss值：</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4dd020b7",
   "metadata": {},
   "outputs": [],
   "source": [
    "#your code here------\n",
    "# 使用训练后的逻辑回归模型对测试数据集'test_titanic.csv'进行预测\n",
    "test = np.array(test_frame)\n",
    "trained = [\n",
    "    omega[0] + omega[1] * x[1] + omega[2] * x[2] + omega[3] * x[3] + omega[4] * x[4]\n",
    "    for x in test\n",
    "]\n",
    "\n",
    "# 使用损失函数计算loss值\n",
    "loss = 0\n",
    "for i in range(len(test)):\n",
    "    loss += math.log(\n",
    "        test[i][5] / (1 + math.exp(-trained[i]))\n",
    "        + (1 - test[i][5]) * math.exp(-trained[i]) / (1 + math.exp(-trained[i]))\n",
    "    )\n",
    "loss = -loss / len(test)\n",
    "\n",
    "print(loss)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "354bddd7",
   "metadata": {},
   "source": [
    "**<font color = blue size=4>第四部分:作业提交链接</font>**"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "35a454a5",
   "metadata": {},
   "source": [
    "1.实验报告提交链接(有效期直至9.22 14:20): https://send2me.cn/uohF8xm-/SvKmbAQRlFSKqQ\n",
    "\n",
    "实验报告需包含：内容，代码，解释，结果，总结"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be05c034",
   "metadata": {},
   "source": [
    "2.实验课件获取链接: https://www.jianguoyun.com/p/DZzHw20Qp5WhChjwoZwFIAA"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
